Create a destination connector for nomicdb #175
Hi @potter-potter, many thanks for your work at UnstructuredIO. This is my first contribution: adding a nomicdb connector to Unstructured-Ingest. I hope you can give me a pointer or two to improve my pull request.
@@ -0,0 +1,14 @@
{
This looks like an IDE-specific file; it shouldn't be in this PR.
@@ -0,0 +1,80 @@
import os
In general, we've moved to adding integration tests rather than e2e tests for new connectors. Take a look at this s3 example: test_s3.py. This should help isolate the connector code and make testing it easier.
@@ -0,0 +1,125 @@
import multiprocessing as mp
For new connectors, this should live in the v2 directory using the new ingest framework: unstructured_ingest/v2/processes/connectors
This will change the general approach you've introduced here so I'll wait to review the actual connector files until that's been updated.
I've added the |
Description
While using unstructured-ingest to process data, I would like to ingest data directly into a nomic AtlasDataset and visualise it with its Atlas map.
Key changes
1. test_e2e/python/test-ingest-nomicdb.py: a simple integrated demonstration of processing local files with the Unstructured API and ingesting them into a nomic map
2. unstructured_ingest/connector/nomicdb.py: the connector itself, implemented using connector/qdrant.py as an example
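For orientation, the write path of a destination connector like this can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in this PR: InMemoryDatasetClient, to_row, and write_elements are hypothetical names, the client is a stand-in rather than the nomic or unstructured-ingest API, and the element keys (element_id, text, type) are assumed to match the partitioned JSON output.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Iterable, Iterator


@dataclass
class InMemoryDatasetClient:
    """Hypothetical stand-in for a dataset client (NOT the nomic API)."""

    batches: list[list[dict[str, Any]]] = field(default_factory=list)

    def add_data(self, rows: list[dict[str, Any]]) -> None:
        # A real client would upload the rows; here we only record them.
        self.batches.append(rows)


def to_row(element: dict[str, Any]) -> dict[str, Any]:
    """Flatten one partitioned element into a flat record (assumed keys)."""
    return {
        "id": element["element_id"],
        "text": element.get("text", ""),
        "type": element.get("type", ""),
    }


def batched(rows: Iterable[dict[str, Any]], size: int) -> Iterator[list[dict[str, Any]]]:
    """Yield lists of at most `size` rows, preserving order."""
    batch: list[dict[str, Any]] = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def write_elements(client: InMemoryDatasetClient,
                   elements: Iterable[dict[str, Any]],
                   batch_size: int = 50) -> int:
    """Convert elements to rows and send them in batches; return the row count."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    written = 0
    for batch in batched((to_row(e) for e in elements), batch_size):
        client.add_data(batch)
        written += len(batch)
    return written
```

Keeping the conversion (to_row) separate from the upload (write_elements) means the upload logic can be exercised against a fake client, without credentials or network access.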
Testing